Abstract. The state complexity of a regular language is the number of
states in the minimal deterministic automaton accepting the language.
The syntactic complexity of a regular language is the cardinality of its
syntactic semigroup. The syntactic complexity of a subclass of regular
languages is the worst-case syntactic complexity taken as a function of
the state complexity $n$ of languages in that class. We study the
syntactic complexity of the class of regular ideal languages and their
complements, the closed languages. We prove that $n^{n-1}$ is a tight
upper bound on the complexity of right ideals and prefix-closed
languages, and that there exist left ideals and suffix-closed languages
of syntactic complexity $n^{n-1} + n - 1$, and two-sided ideals and
factor-closed languages of syntactic complexity
$n^{n-2} + (n-2)2^{n-2} + 1$.

Keywords: automaton, closed, complexity, ideal, language, monoid, reg-

1 Introduction

There are two fundamental congruence relations in the theory of regular lan- 
guages: the Nerode congruence [20], and the Myhill congruence [19]. In both 
cases, a language is regular if and only if it is a union of congruence classes 
of a congruence of finite index. The Nerode congruence leads to the definitions 
of left quotients of a language and the minimal deterministic finite automaton 
recognizing the language. The Myhill congruence leads to the definitions of the 
syntactic semigroup and the syntactic monoid of the language. 

The state complexity of a language is defined as the number of states in the
minimal deterministic automaton recognizing the language. This concept has
been studied quite extensively: for surveys of this topic and lists of references
we refer the reader to [3,31]. On the other hand, in spite of suggestions that
syntactic semigroups deserve to be studied further [14,17], relatively little has 
been done on the "syntactic complexity" of a regular language, which we define 
as the cardinality of the syntactic semigroup of the language. This semigroup is 
isomorphic to the semigroup of transformations of the set of states of the minimal 
deterministic automaton recognizing the language, where these transformations 
are performed by non-empty words. 

The following example illustrates the significant difference between state com- 
plexity and syntactic complexity. 

Example 1. The deterministic automata in Fig. 1 have the same alphabet, are all
minimal, and have the same state complexity. However, the syntactic complexity
of $A_1$ is 3, that of $A_2$ is 9, and that of $A_3$ is 27.

[Figure 1: state diagrams of the automata $A_1$, $A_2$, and $A_3$ over the
letters a, b, c.]

Fig. 1. Automata with various syntactic complexities.

Syntactic complexity provides an alternative measure for the complexity of 
a regular language. The following question then arises: 

Is it possible to find upper bounds to the syntactic complexity of a 
regular language from its properties or from the properties of its minimal 
deterministic automaton? 

We shed some light on this question for ideal and closed regular languages.

2 Background

This section provides a brief informal overview of the past work related to the 
topic of this paper. The relevant concepts will be formally defined later. 

In 1938, Piccard [22] proved that two generators are sufficient to generate
the set of all permutations of a set of $n$ elements, that is, the symmetric
group of degree $n$. The two generators can be a cyclic permutation of all the
elements and a transposition of two of the elements. References to her other
early papers can be found in her books [23,24], published in 1946 and 1957,
where the problem of generators of groups is treated in detail.

In 1960, 1962, and 1963 Salomaa [26-28] studied, among other problems, the 
sets that can generate the set of all transformations of a set of n elements. In 
particular, his aim was to replace the symmetric group of degree n by smaller 
groups of degree n. 

In 1968, Dénes [8] proved that three transformations are sufficient to generate
the set of all transformations of a set $S_n$ of $n$ elements. One can use the
two transformations that generate the symmetric group of degree $n$ and an
additional transformation that maps each of the $n-1$ elements of a subset
$S_{n-1}$ of $S_n$ to itself, and the last element to some element of
$S_{n-1}$. Moreover, he showed that fewer than three generators are not
sufficient. A summary of other work by Dénes on transformations can be found
in [9].

In 1970, Maslov [18] dealt with the problem of generators of the semigroup
of all transformations in the setting of finite automata. He pointed out that a
certain ternary automaton with $n$ states has $n^n$ transformations. He also
stated without proof that it is not possible to reach this bound with a binary
automaton, and that the precise bound for the binary case is not known. He
exhibited a binary automaton with $n$ states that has at least $(n-1)^{n-1}$
transformations.

In 2002-2004, Holzer and König [12-14] studied the syntactic complexity
of automata. They remarked that the syntactic complexity of a unary regular
language of state complexity $n$ is at most $n$, and this bound can be met.
They also noted that $n^n$ is a tight bound on the complexity of languages over
alphabets $\Sigma$ with $|\Sigma| \ge 3$. Their main contributions are in the
most difficult case, that of a binary alphabet. They proved that, for
$n \ge 3$, the function $n^n - n! + g(n)$ is an upper bound to the syntactic
complexity of a binary regular language, where $g(n)$ is the Landau function.
They also showed that a syntactic complexity of $n^n(1 - 2/\sqrt{n})$ can be
achieved. For any prime $n \ge 7$, they characterized a 2-generator semigroup
of maximal complexity.

In 2003, Salomaa [29] considered all the words over the alphabet $\Sigma$ of a
finite automaton that perform the same transformation $t$. In particular, he
defined the length of the shortest such word to be the depth with respect to
$\Sigma$ of the transformation $t$. The depth of $t$ was then defined as the
maximum over all $\Sigma$ that produce $t$. Finally, he defined the complete
depth of a transformation to be its depth when $\Sigma$ ranges over all
alphabets that generate all the transformations. Many properties of the depth
functions are established in this paper.

In 2003 and 2005, Krawetz, Lawrence and Shallit [17] studied the state
complexity of the operation
$\mathrm{root}(L) = \{w \in \Sigma^* \mid \exists n \ge 1 \text{ such that } w^n \in L\}$,
which is bounded from above by $n^n$, where $n$ is the state complexity of $L$.
In fact, they showed that a finite automaton with at most $n^n$ states can be
constructed to accept $\mathrm{root}(L)$, and obtained a lower bound on the
state complexity of $\mathrm{root}(L)$ for binary $L$. For alphabets of at
least three letters, they showed that the bound on the state complexity of
$\mathrm{root}(L)$ can be improved to $n^n - \binom{n}{2}$.

3 Ideal and Closed Languages



If $w = uxv$ for some $u, v, x \in \Sigma^*$, then $u$ is a prefix of $w$, $v$
is a suffix of $w$, and $x$ is a factor of $w$. A prefix or suffix of $w$ is
also a factor of $w$.

A language $L$ is prefix-convex [1] if, whenever $u, w \in L$ and $u$ is a
prefix of $w$, every word $v$ such that $u$ is a prefix of $v$ and $v$ is a
prefix of $w$ is also in $L$. It is prefix-closed if $w \in L$ implies that
every prefix of $w$ is also in $L$. In the same way, we define suffix-convex
and factor-convex, and the corresponding closed versions.

A language $L \subseteq \Sigma^*$ is a right ideal (respectively, left ideal,
two-sided ideal) if it is non-empty and satisfies $L = L\Sigma^*$
(respectively, $L = \Sigma^*L$, $L = \Sigma^*L\Sigma^*$). We refer to all three
types as ideal languages or simply ideals.

Suffix-closed languages were studied in 1974 by Gill and Kou [11], in 1976
by Galil and Simon [10], in 1979 by Veloso and Gill [30], in 2001 by Holzer,
K. Salomaa, and Yu [15], in 2009 by Kao, Rampersad, and Shallit [16] and by
Ang and Brzozowski [1], and in 2010 by Brzozowski, Jirásková and Zou [6].

Left and right ideals were studied by Paz and Peleg [21] in 1965 under the
names "ultimate definite" and "reverse ultimate definite events". Complexity
issues of conversion of nondeterministic finite automata to deterministic finite
automata in right, left, and two-sided ideals were studied in 2008 by Bordihn,
Holzer, and Kutrib [2]. The closure properties of ideals were analyzed in [1].
Decision problems for various classes of convex languages, including ideals, were
addressed by Brzozowski, Shallit and Xu in [7].

4 Transformations 

A transformation of a set $Q$ is a mapping of $Q$ into itself, whereas a
permutation of $Q$ is a mapping of $Q$ onto itself. In this paper we consider
only transformations of finite sets, and we assume without loss of generality
that $Q = \{0, 1, \ldots, n-1\}$. An arbitrary transformation has the form

  \[ t = \begin{pmatrix} 0 & 1 & \cdots & n-2 & n-1 \\
                         i_0 & i_1 & \cdots & i_{n-2} & i_{n-1} \end{pmatrix}, \]

where $i_k \in Q$ for $0 \le k \le n-1$. To simplify the notation, such a
transformation will often be denoted by
$t : [i_0, i_1, \ldots, i_{n-2}, i_{n-1}]$, or just
$[i_0, i_1, \ldots, i_{n-2}, i_{n-1}]$ if $t$ is understood. The identity
transformation is the mapping

  \[ \begin{pmatrix} 0 & 1 & \cdots & n-2 & n-1 \\
                     0 & 1 & \cdots & n-2 & n-1 \end{pmatrix}. \]

We will consider cycles of length $k$ of the following form:

  \[ \begin{pmatrix}
     0 & 1 & \cdots & i-1 & \mathbf{i} & \mathbf{i+1} & \cdots & \mathbf{i+k-2} & \mathbf{i+k-1} & i+k & \cdots & n-2 & n-1 \\
     0 & 1 & \cdots & i-1 & \mathbf{i+1} & \mathbf{i+2} & \cdots & \mathbf{i+k-1} & \mathbf{i} & i+k & \cdots & n-2 & n-1
     \end{pmatrix}, \]

where we show in bold type the elements that are changed by $t$. To simplify
the notation, such a cycle is represented by $(i, i+1, \ldots, i+k-1)$. A
cycle of length 1 is the identity. A singular transformation is a
transformation of the form

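  \[ \begin{pmatrix}
     0 & 1 & \cdots & i-1 & \mathbf{i} & i+1 & \cdots & n-2 & n-1 \\
     0 & 1 & \cdots & i-1 & \mathbf{j} & i+1 & \cdots & n-2 & n-1
     \end{pmatrix}, \]

which is denoted by $\binom{i}{j}$. The singular transformation $\binom{i}{i}$
is the identity. For $i < j$, a transposition is a transformation of the form

  \[ \begin{pmatrix}
     0 & 1 & \cdots & i-1 & \mathbf{i} & i+1 & \cdots & j-1 & \mathbf{j} & j+1 & \cdots & n-2 & n-1 \\
     0 & 1 & \cdots & i-1 & \mathbf{j} & i+1 & \cdots & j-1 & \mathbf{i} & j+1 & \cdots & n-2 & n-1
     \end{pmatrix}, \]

which is denoted by $(i, j)$, with $(i, i)$ being the identity. A transposition
is also a cycle of length 2.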

A constant transformation is a transformation of the form

  \[ \begin{pmatrix} 0 & 1 & \cdots & n-2 & n-1 \\
                     i & i & \cdots & i & i \end{pmatrix}, \]

and it is denoted by $\binom{Q}{i}$.

The set of all $n^n$ transformations of a set $Q$ is a monoid under composition
of transformations, with identity as the unit element. The set of all $n!$
permutations of $Q$ is a group, the symmetric group of degree $n$. The
following facts about generators of particular semigroups are well-known:

Theorem 1 (Permutations). The symmetric group $S_n$ of size $n!$ can be
generated by any cyclic permutation of $n$ elements together with an arbitrary
transposition. In particular, $S_n$ can be generated by
$c = (0, 1, \ldots, n-1)$ and $t = (0, 1)$.

Theorem 2 (Transformations). The complete transformation monoid $T_n$ of
size $n^n$ can be generated by any cyclic permutation of $n$ elements together
with a transposition and a "returning" transformation $r = \binom{n-1}{0}$. In
particular, $T_n$ can be generated by $c = (0, 1, \ldots, n-1)$, $t = (0, 1)$
and $r = \binom{n-1}{0}$.
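For small $n$, the particular generators named in Theorems 1 and 2 can be
checked by brute force. The following is only an illustrative sketch, not part
of the original development: transformations are stored as tuples of images,
products are read left to right, and the helper names (`compose`,
`generated_semigroup`, `cycle`, `transposition`, `returning`) are ours.

```python
from math import factorial

def compose(s, t):
    """Transformation 'apply s, then t' on states 0..n-1 (tuples of images)."""
    return tuple(t[q] for q in s)

def generated_semigroup(generators):
    """All transformations obtained as non-empty products of the generators."""
    gens = [tuple(g) for g in generators]
    seen = set(gens)
    frontier = list(seen)
    while frontier:
        fresh = []
        for s in frontier:
            for g in gens:
                u = compose(s, g)
                if u not in seen:
                    seen.add(u)
                    fresh.append(u)
        frontier = fresh
    return seen

def cycle(n):
    """c = (0, 1, ..., n-1): q -> q+1 mod n."""
    return tuple((q + 1) % n for q in range(n))

def transposition(n):
    """t = (0, 1): swap 0 and 1, fix everything else."""
    images = list(range(n))
    images[0], images[1] = 1, 0
    return tuple(images)

def returning(n):
    """r: send n-1 to 0, fix everything else."""
    images = list(range(n))
    images[n - 1] = 0
    return tuple(images)

for n in range(2, 6):
    perms = generated_semigroup([cycle(n), transposition(n)])
    full = generated_semigroup([cycle(n), transposition(n), returning(n)])
    # Compare with the sizes n! and n^n stated in Theorems 1 and 2.
    print(n, len(perms) == factorial(n), len(full) == n ** n)
```

The identity is obtained as the $n$-th power of the cycle, so counting
non-empty products loses nothing here.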

5 Quotient Complexity and Syntactic Complexity 

If $\Sigma$ is a non-empty finite alphabet, then $\Sigma^*$ is the free monoid
generated by $\Sigma$, and $\Sigma^+$ is the free semigroup generated by
$\Sigma$. A word is any element of $\Sigma^*$, and the empty word is
$\varepsilon$. The length of a word $w \in \Sigma^*$ is $|w|$. A language over
$\Sigma$ is any subset of $\Sigma^*$. The left quotient, or simply quotient, of
a language $L$ by a word $w$ is the language
$L_w = \{x \in \Sigma^* \mid wx \in L\}$.

An equivalence relation $\sim$ on $\Sigma^*$ is a left congruence if, for all
$x, y \in \Sigma^*$,

  $x \sim y \Leftrightarrow ux \sim uy$, for all $u \in \Sigma^*$.          (1)

It is a right congruence if, for all $x, y \in \Sigma^*$,

  $x \sim y \Leftrightarrow xv \sim yv$, for all $v \in \Sigma^*$.          (2)

It is a congruence if it is both a left and a right congruence. Equivalently,
$\sim$ is a congruence if

  $x \sim y \Leftrightarrow uxv \sim uyv$, for all $u, v \in \Sigma^*$.     (3)

For any language $L \subseteq \Sigma^*$, define the Nerode congruence [20]
$\rightarrow_L$ of $L$ by

  $x \rightarrow_L y$ if and only if $xv \in L \Leftrightarrow yv \in L$, for all $v \in \Sigma^*$.   (4)

Evidently, $L_x = L_y$ if and only if $x \rightarrow_L y$. Thus, each
equivalence class of this congruence corresponds to a distinct quotient of $L$.

For any language $L \subseteq \Sigma^*$, define the Myhill congruence [19]
$\leftrightarrow_L$ of $L$ by

  $x \leftrightarrow_L y$ if and only if $uxv \in L \Leftrightarrow uyv \in L$ for all $u, v \in \Sigma^*$.   (5)

This congruence is also known as the syntactic congruence of $L$. The semigroup
$\Sigma^+ / \leftrightarrow_L$ of equivalence classes of the relation
$\leftrightarrow_L$ is the syntactic semigroup of $L$, and
$\Sigma^* / \leftrightarrow_L$ is the syntactic monoid of $L$. The syntactic
complexity $\sigma(L)$ of $L$ is the cardinality of its syntactic semigroup.
The monoid complexity $\mu(L)$ of $L$ is the cardinality of its syntactic
monoid. If the equivalence class containing $\varepsilon$ is a singleton in the
syntactic monoid, then $\sigma(L) = \mu(L) - 1$; otherwise,
$\sigma(L) = \mu(L)$.

A (deterministic) semiautomaton is a triple, $\mathcal{S} = (Q, \Sigma, \delta)$,
where $Q$ is a finite, non-empty set of states, $\Sigma$ is a finite non-empty
alphabet, and $\delta : Q \times \Sigma \to Q$ is the transition function. A
deterministic finite automaton or simply automaton is a quintuple
$\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$, where $Q$, $\Sigma$, and $\delta$
are as defined in the semiautomaton $\mathcal{S} = (Q, \Sigma, \delta)$,
$q_0 \in Q$ is the initial state, and $F \subseteq Q$ is the set of final
states. A nondeterministic finite automaton or simply nondeterministic
automaton is a quintuple $\mathcal{N} = (Q, \Sigma, \eta, S, F)$, where $Q$,
$\Sigma$, and $F$ are as defined in a deterministic automaton,
$S \subseteq Q$ is the set of initial states, and
$\eta : Q \times \Sigma \to 2^Q$ is the transition function.

The $\varepsilon$-function $L^\varepsilon$ of a regular language $L$ is
$L^\varepsilon = \emptyset$ if $\varepsilon \notin L$;
$L^\varepsilon = \varepsilon$ if $\varepsilon \in L$. The quotient automaton of
a regular language $L$ is $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$, where
$Q = \{L_w \mid w \in \Sigma^*\}$, $\delta(L_w, a) = L_{wa}$,
$q_0 = L_\varepsilon = L$, $F = \{L_w \mid L_w^\varepsilon = \varepsilon\}$, and
$L_w^\varepsilon = (L_w)^\varepsilon$. The number of states in the quotient
automaton of $L$ is the quotient complexity $\kappa(L)$ of $L$. The quotient
complexity is the same as the state complexity, but there are advantages to
using quotients [3]. A quotient automaton can be conveniently represented by
quotient equations [4]:

  $L_w = \bigcup_{a \in \Sigma} a L_{wa} \cup L_w^\varepsilon$,            (6)

where there is one equation for each distinct quotient L w . 

In terms of automata, each equivalence class $[w]_{\rightarrow_L}$ of
$\rightarrow_L$ is the set of all words $w$ that take the automaton to the same
state from the initial state. In terms of quotients, it is the set of words $w$
that can all be followed by the same quotient $L_w$.

In terms of automata, each equivalence class $[w]_{\leftrightarrow_L}$ of the
syntactic congruence is the set of all words that perform the same
transformation on the set of states.

The transformation semigroup (respectively, transformation monoid) of an
automaton is the set of transformations performed by words of $\Sigma^+$
(respectively, $\Sigma^*$) on the set of states. The transformation semigroup
(monoid) of the quotient automaton of $L$ is isomorphic to the syntactic
semigroup (monoid) of $L$.

Proposition 1. For any language $L$ with $\kappa(L) = n$, we have
$n - 1 \le \sigma(L) \le n^n$.

Proof. Since every state other than the initial state has to be reachable from
the initial state by a non-empty word, there must be at least $n-1$
transformations. If $\Sigma = \{a\}$ and $L = a^{n-1}a^*$, then
$\kappa(L) = n$, and $\sigma(L) = n - 1$. Thus the lower bound $n-1$ is
achievable. It is evident that $n^n$ is an upper bound, and by
Theorem 2 this upper bound is achievable if $|\Sigma| \ge 3$. □
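The lower-bound witness in this proof is easy to check mechanically. The sketch
below is an illustration under our own conventions, not the authors' program:
the syntactic complexity of a minimal automaton is computed as the number of
distinct transformations induced by non-empty words, and the function names
`syntactic_complexity` and `unary_witness` are hypothetical.

```python
def syntactic_complexity(letters):
    """Number of transformations induced by non-empty words.

    `letters` maps each input letter to a tuple t, where t[q] is the state
    reached from q. The automaton is assumed to be minimal, so this count
    equals the size of the syntactic semigroup of its language.
    """
    gens = [tuple(t) for t in letters.values()]
    seen = set(gens)
    frontier = list(seen)
    while frontier:
        fresh = []
        for s in frontier:
            for g in gens:
                u = tuple(g[q] for q in s)   # apply s, then the letter g
                if u not in seen:
                    seen.add(u)
                    fresh.append(u)
        frontier = fresh
    return len(seen)

def unary_witness(n):
    """Quotient automaton of a^(n-1) a*: state q counts letters, capped at n-1."""
    return {"a": tuple(min(q + 1, n - 1) for q in range(n))}

for n in range(2, 8):
    print(n, syntactic_complexity(unary_witness(n)))   # prints n and n-1
```

The powers $a, a^2, \ldots, a^{n-1}$ induce distinct transformations, and every
longer power coincides with $a^{n-1}$, which is why the count stops at $n-1$.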

If one of the quotients of $L$ is $\emptyset$ (respectively, $\varepsilon$,
$\Sigma^*$, $\Sigma^+$), then we say that $L$ has $\emptyset$ (respectively,
$\varepsilon$, $\Sigma^*$, $\Sigma^+$). A quotient $L_w$ of a language $L$ is
uniquely reachable (ur) [3] if $L_x = L_w$ implies that $x = w$. If $L_{wa}$ is
uniquely reachable for $a \in \Sigma$, then so is $L_w$. Thus, if $L$ has a
uniquely reachable quotient, then $L$ itself is uniquely reachable by the empty
word, i.e., the minimal automaton of $L$ is non-returning.

Theorem 3 (Special Quotients). Let $L$ be any language with $\kappa(L) = n$.

1. If $L$ has $\emptyset$ or $\Sigma^*$, then $\sigma(L) \le n^{n-1}$.

2. If $L$ has $\varepsilon$ or $\Sigma^+$, then $\sigma(L) \le n^{n-2}$.

3. If $L$ is uniquely reachable, then $\sigma(L) \le (n-1)^n$.

4. If $L_a$ is uniquely reachable for some $a \in \Sigma$, then
   $\sigma(L) \le 1 + (n-2)^n$.

Moreover, these effects are cumulative as shown in Table 1.



Table 1. Upper bounds on syntactic complexity for languages with special quotients.

Proof. 1. Since $\emptyset_a = \emptyset$ for all $a \in \Sigma$, there are only
$n-1$ states in the quotient automaton with which one can distinguish two
transformations. Hence there are at most $n^{n-1}$ such transformations. If $L$
has $\Sigma^*$, then $(\Sigma^*)_a = \Sigma^*$ for all $a \in \Sigma$, and the
same argument applies.

2. Since $\varepsilon_a = \emptyset$ for all $a \in \Sigma$, $L$ has
$\emptyset$ if $L$ has $\varepsilon$. Now there are two states that do not
contribute to distinguishing among different transformations. Dually,
$(\Sigma^+)_a = \Sigma^*$ for all $a \in \Sigma$, and the same argument applies.

3. If $L$ is uniquely reachable then $L_w = L$ implies $w = \varepsilon$. Thus
$L$ does not appear as a result of any transformation by a word in $\Sigma^+$,
and there remain only $n-1$ choices for each of the $n$ states.

4. If $L_a$ is uniquely reachable, then so is $L$. Hence $L$ never appears as a
result of a transformation by a word in $\Sigma^+$, and $L_a$ appears only in
one transformation. Therefore there can be at most $(n-2)^n$ other
transformations. □

6 Right Ideals and Prefix-Closed Languages 

In this section we characterize the syntactic complexity of right ideals. The 
automaton defined below plays an important role in this theory. 

Definition 1. For $n \ge 4$, define the automaton

  $\mathcal{A}_n = (\{0, 1, \ldots, n-1\}, \{a, b, c, d\}, \delta, 0, \{n-1\})$,

where $a = (0, 1, \ldots, n-2)$, $b = (0, 1)$, $c = \binom{n-2}{0}$, and
$d = \binom{n-2}{n-1}$. The transition function $\delta$ is then defined using
these transformations. The automaton so defined accepts a right ideal and is
minimal; it is depicted in Fig. 2.

[Figure 2: state diagram of $\mathcal{A}_n$.]

Fig. 2. Automaton $\mathcal{A}_n$ of a right ideal with $n^{n-1}$ transformations.



Theorem 4 (Right Ideals and Prefix-Closed Languages). Let
$L \subseteq \Sigma^*$ have quotient complexity $n$. If $L$ is a right ideal or
a prefix-closed language, then the syntactic complexity of $L$ is less than or
equal to $n^{n-1}$. Moreover, the bound is tight for $n = 1$ if
$|\Sigma| \ge 1$, for $n = 2$ if $|\Sigma| \ge 2$, for $n = 3$ if
$|\Sigma| \ge 3$, and for $n \ge 4$ if $|\Sigma| \ge 4$.

Proof. Since every prefix-closed language other than $\Sigma^*$ is the
complement of a right ideal, and complementation preserves syntactic
complexity, it suffices to consider only right ideals.

If $L$ is a right ideal, then it has $\Sigma^*$ as a quotient. By Theorem 3, we
have $\sigma(L) \le n^{n-1}$.

Next we prove that the language $L = L(\mathcal{A}_n)$ accepted by the
automaton of Fig. 2 meets this bound. Consider any transformation $t$ of the
form

  \[ t = \begin{pmatrix} 0 & 1 & 2 & \cdots & n-3 & n-2 & n-1 \\
                         i_0 & i_1 & i_2 & \cdots & i_{n-3} & i_{n-2} & n-1 \end{pmatrix}, \]

where $i_k \in \{0, 1, \ldots, n-1\}$ for $0 \le k \le n-2$. There are two
cases:

1. Suppose $i_k \ne n-1$ for all $k$, $0 \le k \le n-2$. By Theorem 2, since
   all the images of the first $n-1$ states are in the set
   $\{0, 1, \ldots, n-2\}$, transformation $t$ can be performed by
   $\mathcal{A}_n$.

2. If $i_h = n-1$ for some $h$, $0 \le h \le n-2$, then by the pigeon-hole
   principle, there exists some $j$, $0 \le j \le n-2$, such that $i_k \ne j$
   for all $k$, $0 \le k \le n-2$.

Define $i'_k$ for all $0 \le k \le n-2$ as follows:

  \[ i'_k = \begin{cases} j, & \text{if } i_k = n-1; \\ i_k, & \text{if } i_k \ne n-1. \end{cases} \]

Then let

  \[ s = \begin{pmatrix} 0 & 1 & 2 & \cdots & n-3 & n-2 & n-1 \\
                         i'_0 & i'_1 & i'_2 & \cdots & i'_{n-3} & i'_{n-2} & n-1 \end{pmatrix}. \]

Also, let $r = (j, n-2)$. Since all the images of the first $n-1$ states in $s$
and $r$ are in the set $\{0, 1, \ldots, n-2\}$, by Theorem 2, $s$ and $r$ can
be performed by $\mathcal{A}_n$.

We show now that $t = srdr$, which implies that $t$ can also be performed
by $\mathcal{A}_n$. If $t$ maps $k$ to $n-1$, then $s$ maps $k$ to $j$, $r$
maps $j$ to $n-2$, $d$ maps $n-2$ to $n-1$, and $r$ maps $n-1$ to $n-1$. If $t$
maps $k$ to $n-2$, then $s$ maps $k$ to $n-2$, $r$ maps $n-2$ to $j$, $d$ maps
$j$ to $j$, and $r$ maps $j$ to $n-2$. If $t$ maps $k$ to $i_k < n-2$, then so
does $srdr$. Hence in all cases the mapping performed by $t$ is the same as
that of $srdr$.

Since there are $n^{n-1}$ transformations like $t$, $L(\mathcal{A}_n)$ meets
the bound.

Now we consider the values $n \le 5$. The bounds claimed below have all been
verified by a computer program.

n=1: There is only one type of right ideal with $n = 1$, namely
   $L = \Sigma^*$, and its syntactic complexity is $\sigma(L) = 1$. Thus the
   bound $1^0 = 1$ is tight for $|\Sigma| \ge 1$.

n=2: If $|\Sigma| = 1$, there is only one right ideal, $L = aa^*$, and
   $\sigma(L) = 1$.
   If $|\Sigma| = 2$, then $b^*a(a+b)^*$ meets the bound $2^1 = 2$ of the
   theorem.

n=3: If $|\Sigma| = 1$, there is only one right ideal, $L = aaa^*$, and
   $\sigma(L) = 2$.
   For $n = 3$, inputs $a$ and $b$ of the automaton of Fig. 2 coincide.
   If $|\Sigma| = 2$, we have verified that $\sigma(L) \le 7$ for all right
   ideals, and the language of $\mathcal{A}_3$ restricted to input alphabet
   $\{a, d\}$ meets the bound 7.
   If $|\Sigma| = 3$, then the language of $\mathcal{A}_3$ restricted to input
   alphabet $\{a, c, d\}$ meets the bound $3^2 = 9$ of the theorem.

n=4: If $|\Sigma| = 1$, there is only one right ideal, $L = aaaa^*$, and
   $\sigma(L) = 3$.
   For $|\Sigma| = 2$, we have verified that $\sigma(L) \le 31$ for all right
   ideals $L$. The bound is reached with the inputs $a : [1, 2, 0, 3]$ and
   $b : [1, 0, 3, 3]$.
   For $|\Sigma| = 3$, we have verified that $\sigma(L) \le 61$ for all right
   ideals $L$, and $\mathcal{A}_4$ restricted to input alphabet $\{a, c, d\}$
   meets this bound.

n=5: For $|\Sigma| = 2$, we have verified that $\sigma(L) \le 167$ for all
   right ideals $L$. The bound is reached with the inputs
   $a : [0, 1, 0, 2, 4]$ and $b : [1, 3, 2, 4, 4]$, or with
   $a : [0, 0, 1, 2, 4]$ and $b : [2, 3, 0, 4, 4]$. For $|\Sigma| = 3$, we have
   verified that $\sigma(L) \le 545$ for all right ideals. The bound is reached
   with the inputs $a : [0, 0, 1, 3, 4]$, $b : [2, 0, 3, 1, 4]$, and
   $c : [3, 1, 2, 4, 4]$. □
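Several of the counts in this proof can be reproduced by closing the letter
transformations of Definition 1 under composition. The following sketch is
only illustrative and is not the computer program mentioned above; the helper
names are ours, and the construction of $a$ is extended to $n = 3$ in the
obvious way (there $a$ and $b$ coincide).

```python
def generated_semigroup(generators):
    """All transformations induced by non-empty products of the generators."""
    gens = [tuple(g) for g in generators]
    seen = set(gens)
    frontier = list(seen)
    while frontier:
        fresh = []
        for s in frontier:
            for g in gens:
                u = tuple(g[q] for q in s)   # apply s, then g
                if u not in seen:
                    seen.add(u)
                    fresh.append(u)
        frontier = fresh
    return seen

def right_ideal_letters(n):
    """Letters a, b, c, d of Definition 1 as transformations of {0, ..., n-1}."""
    ident = list(range(n))
    a = ident[:]
    for q in range(n - 1):
        a[q] = (q + 1) % (n - 1)        # a = (0, 1, ..., n-2); n-1 is fixed
    b = ident[:]
    b[0], b[1] = 1, 0                   # b = (0, 1)
    c = ident[:]
    c[n - 2] = 0                        # c sends n-2 to 0
    d = ident[:]
    d[n - 2] = n - 1                    # d sends n-2 to n-1
    return {"a": tuple(a), "b": tuple(b), "c": tuple(c), "d": tuple(d)}

def restricted(letters, names):
    """Keep only the letters listed in `names`."""
    return [letters[x] for x in names]

for n in (4, 5):
    size = len(generated_semigroup(right_ideal_letters(n).values()))
    print(n, size, n ** (n - 1))        # the two counts agree

a3 = right_ideal_letters(3)
print(len(generated_semigroup(restricted(a3, "ad"))))    # 7
print(len(generated_semigroup(restricted(a3, "acd"))))   # 9
a4 = right_ideal_letters(4)
print(len(generated_semigroup(restricted(a4, "acd"))))   # 61
print(len(generated_semigroup([(1, 2, 0, 3), (1, 0, 3, 3)])))  # 31
```

Because every letter fixes state $n-1$, so does every product, which is the
reason the count can never exceed $n^{n-1}$.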

Table 2 summarizes our result for right ideals. All the numbers shown are 
tight upper bounds. In general, there are many solutions with the same com- 
plexity. 



Table 2. Syntactic complexity bounds for right ideals.

                  n = 1   n = 2   n = 3   n = 4   n = 5   ...   n
  $|\Sigma| = 1$    1       1       2       3       4     ...   $n-1$
  $|\Sigma| = 2$    -       2       7      31     167
  $|\Sigma| = 3$    -       -       9      61     545
  $|\Sigma| = 4$    -       -       -      64     625     ...   $n^{n-1}$

It is interesting to note that for our right ideal $L$ with maximal syntactic
complexity, the reverse language has maximal state complexity. Recall that the
reverse $w^R$ of a word $w$ is defined inductively by
$\varepsilon^R = \varepsilon$, $(au)^R = u^R a$. The reverse of a language $L$
is $L^R = \{w^R \mid w \in L\}$. It was shown in [5] that the reverse of a
right ideal with $n$ quotients has at most $2^{n-1}$ quotients, and that this
bound can be met by a binary automaton. We now prove that automaton
$\mathcal{A}'_n$, which is $\mathcal{A}_n$ restricted to inputs $a$ and $d$, is
another example of a binary automaton that meets the $2^{n-1}$ bound for
reversal of right ideals. The nondeterministic automaton $\mathcal{N}_n$
obtained by reversing $\mathcal{A}'_n$ is shown in Fig. 3.

Theorem 5 (Reverse of Right Ideal). The reverse of the right ideal
$L(\mathcal{A}'_n)$ has $2^{n-1}$ quotients.

Proof. Let $Z$ be the set of words of the form
$w = d(ae_{j-1})(ae_{j-2}) \cdots (ae_1)(ae_0)$, where $0 \le j \le n-2$ and
$e_i \in \{\varepsilon, d\}$ for $0 \le i < j$. In the subset construction
applied to $\mathcal{N}_n$, word $da^j$ reaches $\{n-2-j, n-1\}$. For
$0 \le i < j$, the set of states reached by $w$ includes state $n-2-i$ if and
only if $e_i = d$. Thus each word $w$ reaches states $n-1$, $n-2-j$, and a
different subset of $\{n-2, n-3, \ldots, n-2-(j-1)\}$. There are $2^j$ such
subsets. As $j$ ranges from 0 to $n-2$, we get
$2^0 + 2^1 + \cdots + 2^{n-2}$ different subsets. Adding the subset $\{n-1\}$
reached by $\varepsilon$, we get $2^{n-1}$ reachable subsets, each containing
state $n-1$.

Fig. 3. Nondeterministic automaton of the reverse of a right ideal.

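Let $K = L^R$. The only state accepting $da^{n-2}$ is $n-1$, reached by
$\varepsilon$. If $S$ and $T$ are two different subsets of
$\{0, \ldots, n-1\}$ reachable by $u$ and $v$, respectively, and
$i \in S \setminus T$, then $a^i \in K_u \setminus K_v$, since $i$ is the only
state from which $a^i$ leads to the final state 0 of $\mathcal{N}_n$. Hence all
the words in $\{\varepsilon\} \cup Z$ are pairwise distinguishable, and
$K = L^R$ has $2^{n-1}$ distinct quotients. □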
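The reachability half of this argument can be replayed by running the subset
construction on the reversed automaton. The sketch below is an illustration
under our own naming (`binary_right_ideal`, `reverse_reachable_subsets`): it
counts reachable subsets only, and relies on the proof above for the fact that
distinct reachable subsets are distinguishable.

```python
def binary_right_ideal(n):
    """Letters a and d of A'_n (Definition 1 restricted to {a, d})."""
    a = list(range(n))
    for q in range(n - 1):
        a[q] = (q + 1) % (n - 1)    # a = (0, 1, ..., n-2); n-1 is fixed
    d = list(range(n))
    d[n - 2] = n - 1                # d sends n-2 to n-1
    return {"a": tuple(a), "d": tuple(d)}

def reverse_reachable_subsets(letters, finals):
    """Subsets reached by the subset construction on the reversed automaton.

    Reversal swaps initial and final states, so the search starts from the
    set of final states and follows each letter backwards (preimages).
    """
    n = len(next(iter(letters.values())))
    start = frozenset(finals)
    seen = {start}
    stack = [start]
    while stack:
        subset = stack.pop()
        for t in letters.values():
            pre = frozenset(q for q in range(n) if t[q] in subset)
            if pre not in seen:
                seen.add(pre)
                stack.append(pre)
    return seen

for n in range(4, 9):
    subsets = reverse_reachable_subsets(binary_right_ideal(n), {n - 1})
    print(n, len(subsets), 2 ** (n - 1))   # the two counts agree
```

Every reachable subset contains state $n-1$, because both letters fix it; this
caps the count at $2^{n-1}$, and the words in $\{\varepsilon\} \cup Z$ show
that the cap is attained.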

7 Left Ideals and Suffix-Closed Languages 

We provide strong support for the following conjecture about left ideals and 
suffix-closed languages: 

Conjecture 1 (Left Ideals and Suffix-Closed Languages). If $L$ is a left
ideal or a suffix-closed language with quotient complexity
$\kappa(L) = n \ge 1$, then its syntactic complexity is less than or equal to
$n^{n-1} + n - 1$.

We show in this section that this complexity can be reached. Since every
suffix-closed language other than $\Sigma^*$ is the complement of a left ideal
and complementation preserves syntactic complexity, it suffices to consider
only left ideals. Before attacking the conjecture itself, we prove some
auxiliary results.

First we recall a result of Restivo and Vaglica [25]. Consider a semiautomaton
$\mathcal{S} = (P \cup \{0\}, \Sigma, \delta)$, where 0 is a sink state,
meaning that $\delta(0, a) = 0$ for all $a \in \Sigma$, and $P$ is strongly
connected. Such a semiautomaton is uniformly minimal if the automaton
$\mathcal{A} = (P \cup \{0\}, \Sigma, \delta, q_0, F)$ is minimal for every
$q_0 \in P$ and $\emptyset \ne F \subseteq P$.

One can test whether a semiautomaton is uniformly minimal with the aid
of the directed pair graph $G = G(\mathcal{S}) = (V, E)$. The vertices of $G$
are all the unordered pairs $(p, q)$ of states with $p \ne q$. There is an edge
from $(p, q)$ to $(r, s)$ if and only if $\delta(p, a) = r$ and
$\delta(q, a) = s$ for some $a \in \Sigma$. Then $\mathcal{S}$ is uniformly
minimal if and only if, for any pair $(p, q)$, there is a path to $(0, r)$ for
some $r \in P$.

Definition 2. Let $n \ge 3$, and let $\mathcal{S}_n$ be the semiautomaton

  $\mathcal{S}_n = (\{0, \ldots, n-1\}, \{a, b, c, d, e\}, \delta)$,

where $a = (1, 2, \ldots, n-1)$, $b = (1, 2)$, $c = \binom{n-1}{1}$,
$d = \binom{n-1}{0}$, and $e$ is the uniform transformation $\binom{Q}{1}$. The
state graph of $\mathcal{S}_n$ is shown in Fig. 4. For $n = 3$ inputs $a$ and
$b$ coincide; hence here we use $\Sigma = \{b, c, d, e\}$.

[Figure 4: state graph of $\mathcal{S}_n$.]

Fig. 4. Semiautomaton $\mathcal{S}_n$ with $n^{n-1} + n - 1$ transformations.



Definition 3. Let $\Sigma' = \Sigma \setminus \{e\}$ and let $\mathcal{R}_n$ be
the semiautomaton $\mathcal{R}_n = (Q, \Sigma', \delta')$, where
$Q = P \cup \{0\}$, $P = \{1, \ldots, n-1\}$, and $\delta'$ is the restriction
of $\delta$ to $Q \times \Sigma'$. Note that 0 is a sink state of
$\mathcal{R}_n$.

Lemma 1. The set $P$ is strongly connected and $\mathcal{R}_n$ is uniformly
minimal.

Proof. Since $a$ is a cycle of the states in $P$, $P$ is strongly connected.

To show that $\mathcal{R}_n$ is uniformly minimal, we construct the state-pair
graph $G = G(\mathcal{R}_n)$ of $\mathcal{R}_n$, as in [25]. We need to show
that for every vertex $v$ in $G$, there is a path to a vertex of the form
$(0, j)$, where $j \in P$.

Assume that all the unordered pairs of distinct states of $\mathcal{R}_n$ are
represented as $(i, j)$, where $i < j$. If a vertex is of the form $(0, j)$,
then there is nothing to prove. If a vertex is of the form $(i, j)$,
$0 < i < j$, then applying $a^{n-1-j}$ reaches $(i+n-1-j, n-1)$. Then $d$ takes
the pair $(i+n-1-j, n-1)$ to $(0, i+n-1-j)$. Consequently, $\mathcal{R}_n$ is
uniformly minimal. □
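The pair-graph criterion used in this proof is straightforward to mechanize.
The following sketch is an illustration with hypothetical helper names; it
checks only the pair-graph condition and takes for granted that state 0 is a
sink and that $P$ is strongly connected, as the criterion requires.

```python
from itertools import combinations

def r_n_letters(n):
    """Letters a, b, c, d of R_n (Definitions 2 and 3) on states 0..n-1."""
    ident = list(range(n))
    a = ident[:]
    for q in range(1, n):
        a[q] = q + 1 if q < n - 1 else 1    # a = (1, 2, ..., n-1)
    b = ident[:]
    b[1], b[2] = 2, 1                       # b = (1, 2)
    c = ident[:]
    c[n - 1] = 1                            # c sends n-1 to 1
    d = ident[:]
    d[n - 1] = 0                            # d sends n-1 to the sink 0
    return {"a": tuple(a), "b": tuple(b), "c": tuple(c), "d": tuple(d)}

def passes_pair_graph_test(letters, sink=0):
    """True if every pair of distinct states has a path to a pair (sink, r)."""
    n = len(next(iter(letters.values())))

    def reaches_sink_pair(p, q):
        seen = {(p, q)}
        stack = [(p, q)]
        while stack:
            x, y = stack.pop()
            if sink in (x, y):              # pair (0, r) with r in P
                return True
            for t in letters.values():
                u, v = sorted((t[x], t[y]))
                if u != v and (u, v) not in seen:   # merged pairs are not vertices
                    seen.add((u, v))
                    stack.append((u, v))
        return False

    return all(reaches_sink_pair(p, q) for p, q in combinations(range(n), 2))

for n in range(3, 8):
    print(n, passes_pair_graph_test(r_n_letters(n)))    # True for each n
```

Dropping the letter $d$ makes the test fail, since no pair of states in $P$ can
then reach the sink; this matches the role $d$ plays in the proof.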

Theorem 6 (Left Ideals and Suffix-Closed Languages). For $n \ge 3$, let
$\mathcal{A}_n = (Q, \Sigma, \delta, 0, F)$, where
$(Q, \Sigma, \delta) = \mathcal{S}_n$ of Def. 2, and $F$ is any non-empty
subset of $Q \setminus \{0\}$. Then $\mathcal{A}_n$ is minimal, and the
language $L = L(\mathcal{A}_n)$ accepted by $\mathcal{A}_n$ is a left ideal and
has syntactic complexity $\sigma(L) = n^{n-1} + n - 1$.

Proof. Since semiautomaton $\mathcal{R}_n$ is uniformly minimal, automaton
$\mathcal{A}_n$ is minimal for every choice of $F$. Hence $L$ has $n$
quotients.

To prove that $L$ is a left ideal it suffices to show that, for any
$w \in L$, we also have $hw \in L$ for every $h \in \Sigma$. This is obvious if
$h \in \Sigma \setminus \{e\}$, since all transitions from state 0 under $h$
lead to state 0. If $w \in L$, then $w$ has the form $w = uev$, where
$\delta(0, u) = 0$, $\delta(0, ue) = 1$, and $v \in L_e$. But
$\delta(0, eue) = 1$, since $\delta(i, eue) = 1$ for all $i \in Q$, and
$v \in L_e$ gives us $euev = ew \in L$. Thus $L$ is a left ideal.

Consider any transformation $t$ of the form

  \[ t = \begin{pmatrix} 0 & 1 & 2 & \cdots & n-2 & n-1 \\
                         0 & i_1 & i_2 & \cdots & i_{n-2} & i_{n-1} \end{pmatrix}, \]

where $i_k \in \{0, 1, 2, \ldots, n-2, n-1\}$ for $1 \le k \le n-1$; there are
$n^{n-1}$ such transformations. We have two cases:

1. If $i_k \ne 0$ for all $k$, $1 \le k \le n-1$, then all the images of the
   last $n-1$ states are in the set $\{1, \ldots, n-1\}$. By Theorem 2, $t$ can
   be performed by $\mathcal{A}_n$.

2. If $i_h = 0$ for some $h$, $1 \le h \le n-1$, then there exists some $j$,
   $1 \le j \le n-1$, such that $i_k \ne j$ for all $k$, $1 \le k \le n-1$.

Define $i'_k$ for all $1 \le k \le n-1$ as follows:

  \[ i'_k = \begin{cases} j, & \text{if } i_k = 0; \\ i_k, & \text{if } i_k \ne 0. \end{cases} \]

Let

  \[ s = \begin{pmatrix} 0 & 1 & 2 & 3 & \cdots & n-3 & n-2 & n-1 \\
                         0 & i'_1 & i'_2 & i'_3 & \cdots & i'_{n-3} & i'_{n-2} & i'_{n-1} \end{pmatrix}, \]

and $r = (j, n-1)$. By Theorem 2, $s$ and $r$ can be performed by
$\mathcal{A}_n$.

Now consider $srdr$. If $t$ maps $k$ to 0, then $s$ maps $k$ to $j$, $r$ maps
$j$ to $n-1$, $d$ maps $n-1$ to 0, and $r$ maps 0 to 0. If $t$ maps $k$ to
$n-1$, then $s$ maps $k$ to $n-1$, $r$ maps $n-1$ to $j$, $d$ maps $j$ to $j$,
and $r$ maps $j$ to $n-1$. Finally, if $t$ maps $k$ to an element other than 0
or $n-1$, then $srdr$ maps $k$ to the same element. Hence we have $t = srdr$,
and $t$ can be performed by $\mathcal{A}_n$ as well.

Now consider any transformation $t$ that maps all the states to some state
$j \ne 0$; there are $n-1$ such transformations. We have two cases:

1. If $j = 1$, then $t = e$; therefore $t$ can be performed by $\mathcal{A}_n$.

2. Otherwise, let $s = (1, j)$. By Theorem 2, $s$ can be performed by
   $\mathcal{A}_n$. Since $t = es$, $t$ can also be performed by
   $\mathcal{A}_n$.

In summary, the syntactic complexity of $L(\mathcal{A}_n)$ is
$n^{n-1} + n - 1$. □

Since inputs $a$ and $b$ of automaton $\mathcal{A}_3$ coincide, we omit $a$.
Table 3 shows the transition table of $\mathcal{A}_3$ and its
$3^2 + 2 = 11$ transformations. We will show that 11 is indeed the maximal
bound for $n = 3$, but we require more properties of left ideals.
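The count in Theorem 6 can be reproduced for small $n$ by closing the five
letters of Definition 2 under composition. This is a hedged sketch with helper
names of our own choosing, not the authors' verification program.

```python
def generated_semigroup(generators):
    """All transformations induced by non-empty products of the generators."""
    gens = [tuple(g) for g in generators]
    seen = set(gens)
    frontier = list(seen)
    while frontier:
        fresh = []
        for s in frontier:
            for g in gens:
                u = tuple(g[q] for q in s)   # apply s, then g
                if u not in seen:
                    seen.add(u)
                    fresh.append(u)
        frontier = fresh
    return seen

def left_ideal_letters(n):
    """Letters a, b, c, d, e of Definition 2 as transformations of {0..n-1}."""
    ident = list(range(n))
    a = ident[:]
    for q in range(1, n):
        a[q] = q + 1 if q < n - 1 else 1    # a = (1, 2, ..., n-1)
    b = ident[:]
    b[1], b[2] = 2, 1                       # b = (1, 2)
    c = ident[:]
    c[n - 1] = 1                            # c sends n-1 to 1
    d = ident[:]
    d[n - 1] = 0                            # d sends n-1 to 0
    e = [1] * n                             # e sends every state to 1
    return {"a": tuple(a), "b": tuple(b), "c": tuple(c),
            "d": tuple(d), "e": tuple(e)}

for n in range(3, 6):
    semigroup = generated_semigroup(left_ideal_letters(n).values())
    fixing_zero = sum(1 for t in semigroup if t[0] == 0)
    print(n, len(semigroup), fixing_zero)   # n^(n-1) + n - 1 and n^(n-1)
```

The second printed number isolates the transformations that fix state 0; the
remaining $n-1$ elements are the constant maps to the states of $P$, obtained
from products that end in $e$ followed by a permutation of $P$.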

Table 3. The eleven transformations of automaton $\mathcal{A}_3$ of a left ideal.

       b    c    d    e    bb   bd   cb   db   eb   bdb  cbd
  0    0    0    0    1    0    0    0    0    2    0    0
  1    2    1    1    1    1    0    2    2    2    0    0
  2    1    1    0    1    2    1    2    0    2    2    0


Let $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ be the quotient automaton of a
left ideal. For every word $w \in \Sigma^*$ consider the sequence
$q_0 = p_0, p_1, p_2, \ldots$ of states obtained by applying powers of $w$ to
the initial state $q_0$, that is, let $p_i = \delta(q_0, w^i)$. Since
$\mathcal{A}$ has $n$ states, we must eventually have a repeated state in that
sequence, that is, we must have some $i$ and $j > i$ such that
$p_0, p_1, \ldots, p_i, p_{i+1}, \ldots, p_{j-1}$ are distinct and
$p_j = p_i$. The sequence
$q_0 = p_0, p_1, \ldots, p_i, p_{i+1}, \ldots, p_{j-1}$ of states with
$p_j = p_i$ is called the behavior of $w$ on $\mathcal{A}$, and the integer
$j - i$ is the period of that behavior. We will use the notation
$(p_0, p_1, \ldots, p_i, p_{i+1}, \ldots, p_{j-1}; p_j = p_i)$ for such
behaviors. If the period of $w$ is 1, then its behavior is aperiodic;
otherwise, it is periodic.

Lemma 2. If $\mathcal{A}$ is the quotient automaton of a left ideal $L$, then
the behavior of every word $w \in \Sigma^*$ is aperiodic. Moreover, $L$ does
not have the empty quotient.

Proof. Suppose that $w$ has the behavior
$(q_0 = p_0, p_1, \ldots, p_i, p_{i+1}, \ldots, p_{j-1}; p_j = p_i)$, where
$j - i \ge 2$; then $j - 1 \ge i + 1$. Since $\mathcal{A}$ is minimal, states
$p_i$ and $p_{j-1}$ must be distinguishable, say by word $x \in \Sigma^*$. If
$w^i x \in L$, then
$w^{j-1}x = w^{j-1-i}w^i x = w^{j-1-i}(w^i x) \notin L$, contradicting the
assumption that $L$ is a left ideal. If $w^{j-1}x \in L$, then
$w^j x = w(w^{j-1}x) \notin L$, because $p_j = p_i$ and $w^i x \notin L$; this
again contradicts that $L$ is a left ideal.

For the second claim, we know that a left ideal is non-empty by definition.
So suppose that $w \in L$. If $L$ has the empty quotient, say
$L_x = \emptyset$, then $xw \notin L$, which is a contradiction. □

Example 2. Note that the conditions of Lemma 2 are not sufficient. For
$\Sigma = \{a, b\}$, the language $L = b \cup \Sigma^*a$ satisfies the
conditions, but is not a left ideal because $b \in L$ but $ab \notin L$. Its
quotient automaton is shown in Fig. 5.

If the accepting state is 2 instead of 1, the language becomes
$L' = \Sigma\Sigma^*b = \Sigma^*\Sigma b$, which is a left ideal. The languages
$L$ and $L'$ have the same syntactic semigroup, but one is a left ideal while
the other is not.

[Figure 5: quotient automaton of $L$.]

Fig. 5. Automaton of a language that is not a left ideal.



Proposition 2. The number of transformations ruled out by Lemma 2 is

$$\sum_{j=2}^{n} \binom{n-1}{j-1} (j-1)!\,(j-1)\, n^{n-j} = \sum_{j=2}^{n} \frac{(n-1)!}{(n-j)!}\,(j-1)\, n^{n-j}. \tag{7}$$

Proof. Consider a behavior $(p_0, p_1, \ldots, p_i, p_{i+1}, \ldots, p_{j-1};\ p_j = p_i)$ of length $j$. The
first state, $p_0$, must be 0, but the set $\{p_1, \ldots, p_{j-1}\}$ can be any subset of
cardinality $j - 1$ of the remaining $n - 1$ states, and there are $\binom{n-1}{j-1}$ such subsets. The
states in each subset can be arranged in any order, giving $(j - 1)!$ permutations.
Then there are $j - 1$ choices for $p_j$, namely $p_0, \ldots, p_{j-2}$, since the period $j - i$ is at
least 2. Finally, the $n - j$ states that are not part of the behavior can have $n$
transformations each, adding the factor $n^{n-j}$. □
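
As a sanity check on the counting argument, the sum can be compared with a direct enumeration for small $n$. This is a sketch under the reading of the proof given above (one factor per step of the argument); the function names are illustrative.

```python
from itertools import product
from math import comb, factorial


def ruled_out_formula(n):
    """Sum over behavior lengths j of (subsets) * (orderings) * (choices of p_j) * (free states)."""
    return sum(comb(n - 1, j - 1) * factorial(j - 1) * (j - 1) * n ** (n - j)
               for j in range(2, n + 1))


def ruled_out_bruteforce(n):
    """Count transformations of {0, ..., n-1} whose orbit of 0 enters a cycle of length >= 2."""
    count = 0
    for t in product(range(n), repeat=n):
        seen, p = [], 0
        while p not in seen:
            seen.append(p)
            p = t[p]
        if len(seen) - seen.index(p) >= 2:   # period j - i
            count += 1
    return count


if __name__ == "__main__":
    for n in range(2, 6):
        print(n, ruled_out_formula(n), ruled_out_bruteforce(n))
```

For $n = 3$ the two terms are 6 (for $j = 2$) and 4 (for $j = 3$), which gives the ten transformations listed in the proof of Theorem 7.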

Lemma 2 provides an upper bound to the syntactic complexity of left ideals, 
as shown in Table 4. However, there is a large gap between this bound and the 
bound we can achieve, and we know that this bound cannot be reached for n = 3. 



Table 4. Number of transformations ruled out by Lemma 2.

$n$                         2      3      4       5
$n^n$                       4     27    256   3,125
ruled out by Lemma 2        1     10    114   1,556
an upper bound              3     17    142   1,569
$n^{n-1} + n - 1$           3     11     67     629





Theorem 7 (Small Left Ideals and Suffix-Closed Languages). If $1 \le n \le 3$
and $L$ is a left ideal or a suffix-closed language with $\kappa(L) = n$, then
$\sigma(L) \le n^{n-1} + n - 1$. Moreover, the bound is tight for $n = 1$ if $|\Sigma| \ge 1$, for $n = 2$
if $|\Sigma| \ge 3$, and for $n = 3$ if $|\Sigma| \ge 4$.

Proof. We consider the three values of n separately. The bounds claimed below 
have all been verified by a computer program. 

n = 1: Here, there is only one type of left ideal, $L = \Sigma^*$. Thus the bound 1 holds,
   and is met by $a^*$ over $\Sigma = \{a\}$.

n = 2: There is only one periodic behavior $(p_0 = 0, p_1 = 1;\ p_2 = p_0)$; hence only
   transformation $[1, 0]$ is ruled out by Lemma 2. Thus the bound 3 holds.

   Now consider any left ideal $L$ with $n = 2$. State 1 must be reachable from
   state 0, say by input $a$. By Lemma 2, we cannot have $a : [1, 0]$, and so we
   have $a : [1, 1]$.

   If $\Sigma = \{a\}$, then we have the left ideal $L = a^* a$ with $\sigma(L) = 1$.
   Thus $\sigma(L) = 1$ if $|\Sigma| = 1$.

   If $\Sigma = \{a, b\}$, then we have three cases:

   1. If $b : [1, 1]$, then $L = \Sigma^*\Sigma$ with $\sigma(L) = 1$.

   2. If $b : [0, 0]$, then $L = \Sigma^* a$ with $\sigma(L) = 2$.

   3. If $b : [0, 1]$, then $L = \Sigma^* a \Sigma^*$ with $\sigma(L) = 2$.

   Thus $\sigma(L) \le 2$ if $|\Sigma| = 2$.

   If $\Sigma = \{a, b, c\}$, the language $L = \Sigma^* a (a + b)^*$ meets the bound 3.

n = 3: For $|\Sigma| = 1$, there is only one left ideal, namely $L = \Sigma^* aa$, and it has
   $\sigma(L) = 2$.

   For $|\Sigma| = 2$, we have verified that the number of transformations is at most
   7, and the automaton with inputs $a : [0, 0, 1]$ and $b : [1, 2, 2]$ meets this bound.

   For $|\Sigma| = 3$, we have verified that the number of transformations is at most
   9, and the automaton $\mathcal{A}_3$ of Theorem 6 restricted to inputs $b : [0, 2, 1]$,
   $d : [0, 1, 0]$ and $e : [1, 1, 1]$ meets this bound.

   Now consider the case $|\Sigma| = 4$. For $n = 3$, there are three types of periodic
   behaviors: $(p_0, p_1;\ p_2 = p_0)$, $(p_0, p_1, p_2;\ p_3 = p_0)$, and $(p_0, p_1, p_2;\ p_3 = p_1)$. The
   following ten transformations are ruled out by Lemma 2: $[1,0,0]$, $[1,0,1]$,
   $[1,0,2]$, $[1,2,0]$, $[1,2,1]$, $[2,0,0]$, $[2,1,0]$, $[2,2,0]$, $[2,0,1]$, and $[2,2,1]$.
   There are six transformations that are not ruled out by Lemma 2 and that
   do not appear in Table 3, namely: $[1,1,0]$, $[1,1,2]$, $[1,2,2]$, $[2,0,2]$, $[2,1,1]$,
   and $[2,1,2]$. Each of these transformations, when followed by a transformation
   from Table 3, results in a transformation ruled out by Lemma 2:

      $t_1 : [1, 1, 0]$ and $cb : [0, 2, 2]$ yield $t_1 cb : [2, 2, 0]$,
      $t_2 : [1, 1, 2]$ and $db : [0, 2, 0]$ yield $t_2 db : [2, 2, 0]$,
      $t_3 : [1, 2, 2]$ and $d : [0, 1, 0]$ yield $t_3 d : [1, 0, 0]$,
      $t_4 : [2, 0, 2]$ and $c : [0, 1, 1]$ yield $t_4 c : [1, 0, 1]$,
      $t_5 : [2, 1, 1]$ and $bdb : [0, 0, 2]$ yield $t_5 bdb : [2, 0, 0]$,
      $t_6 : [2, 1, 2]$ and $bd : [0, 0, 1]$ yield $t_6 bd : [1, 0, 1]$.

All these conflicts are independent of the set of accepting states. Further- 
more, each transformation not ruled out by Lemma 2 conflicts with a dif- 
ferent transformation from Table 3. So at most one transformation can be 
chosen from each pair, showing that there cannot be more than 11 transfor- 
mations for any automaton with three states. Hence the syntactic complexity 
of any left ideal with quotient complexity 3 is at most 11, and the example 
of Table 3 shows that this bound is tight. □ 
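
The six conflicts used in the case $n = 3$, $|\Sigma| = 4$ are small enough to recompute. The sketch below uses only the inputs $b$, $c$, $d$ as quoted in the proof and composes transformations left to right; the helper names are illustrative.

```python
from itertools import product


def compose(s, t):
    """Apply s first, then t (the left-to-right convention of the proof)."""
    return [t[x] for x in s]


def word(*letters):
    """Transformation induced by applying the given inputs in order."""
    out = [0, 1, 2]
    for g in letters:
        out = compose(out, g)
    return out


def is_ruled_out(t):
    """True if the orbit of state 0 under t ends in a cycle of length >= 2."""
    seen, p = [], 0
    while p not in seen:
        seen.append(p)
        p = t[p]
    return len(seen) - seen.index(p) >= 2


b, c, d = [0, 2, 1], [0, 1, 1], [0, 1, 0]   # inputs as quoted in the proof

# (t_i, transformation applied after it, product claimed in the proof)
CONFLICTS = [
    ([1, 1, 0], word(c, b),    [2, 2, 0]),
    ([1, 1, 2], word(d, b),    [2, 2, 0]),
    ([1, 2, 2], word(d),       [1, 0, 0]),
    ([2, 0, 2], word(c),       [1, 0, 1]),
    ([2, 1, 1], word(b, d, b), [2, 0, 0]),
    ([2, 1, 2], word(b, d),    [1, 0, 1]),
]


def check_conflicts():
    """Each t_i is allowed by Lemma 2, but its product is not."""
    for t, s, claimed in CONFLICTS:
        assert not is_ruled_out(t)
        assert compose(t, s) == claimed
        assert is_ruled_out(claimed)
    return len(CONFLICTS)


RULED_OUT = [list(t) for t in product(range(3), repeat=3) if is_ruled_out(t)]

if __name__ == "__main__":
    print(check_conflicts(), len(RULED_OUT))
```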

Table 5 summarizes our results concerning left ideals. The figures in bold 
type are tight upper bounds. The other complexities are achievable, but we have 
no proof that they are upper bounds. In general, there are many solutions with 
the same complexity. 

The complexity 17 for $n = 4$, $|\Sigma| = 2$ is reached with the inputs $a : [1, 2, 3, 3]$
and $b : [0, 0, 1, 2]$. The complexity 25 for $n = 4$, $|\Sigma| = 3$ is met by $\mathcal{A}_4$ of
Theorem 6 restricted to $a, d, e$. The complexity 64 for $n = 4$, $|\Sigma| = 4$ is met by
$\mathcal{A}_4$ of Theorem 6 restricted to $a, c, d, e$.

The complexity 34 for $n = 5$, $|\Sigma| = 2$ is reached with the inputs $a :
[1, 2, 3, 4, 4]$ and $b : [0, 0, 1, 2, 3]$. The complexity 65 for $n = 5$, $|\Sigma| = 3$ is met by
$\mathcal{A}_5$ of Theorem 6 restricted to $a, d, e$. The complexity 453 for $n = 5$, $|\Sigma| = 4$
is met by $\mathcal{A}_5$ of Theorem 6 restricted to $a, c, d, e$.
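
Values of this kind can be reproduced by closing the input transformations under composition. The following is a minimal sketch, assuming the automaton is minimal so that the size of the generated semigroup equals the syntactic complexity; `generated_semigroup` is an illustrative name, not a function from the paper.

```python
def generated_semigroup(generators):
    """All transformations obtainable as non-empty products of the generators."""
    gens = [tuple(g) for g in generators]
    elements = set(gens)
    frontier = list(gens)
    while frontier:
        s = frontier.pop()
        for g in gens:
            t = tuple(g[x] for x in s)    # apply s first, then g
            if t not in elements:
                elements.add(t)
                frontier.append(t)
    return elements


if __name__ == "__main__":
    print(len(generated_semigroup([[0, 0, 1], [1, 2, 2]])))         # n = 3, two inputs
    print(len(generated_semigroup([[1, 2, 3, 3], [0, 0, 1, 2]])))   # n = 4, two inputs
    print(len(generated_semigroup([[1, 2, 3, 4, 4], [0, 0, 1, 2, 3]])))
```

Multiplying only on the right by generators is sufficient, since every product of generators is reached by extending a shorter product one letter at a time.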

As was the case with right ideals, for our left ideal with maximal syntactic 
complexity, the reverse language has maximal state complexity, as the next result 
shows. This time, however, we require an alphabet of four letters.

Table 5. Syntactic complexities for left ideals.

                  n = 1   n = 2   n = 3   n = 4   n = 5   n = n
$|\Sigma| = 1$      1       1       2       3       4     $n - 1$
$|\Sigma| = 2$              2       7      17      34
$|\Sigma| = 3$              3       9      25      65
$|\Sigma| = 4$                     11      64     453
$|\Sigma| = 5$                             67     629







Theorem 8 (Reverse of Left Ideal). The reverse of the left ideal accepted
by automaton $\mathcal{A}_n$ of Theorem 6 restricted to $\{a, c, d, e\}$ has $2^{n-1} + 1$ quotients,
which is the maximum possible for a left ideal.

Proof. Consider the subset construction applied to the nondeterministic automaton
of Fig. 6. First we show that the subset $Q = \{0, 1, \ldots, n-1\}$ and all subsets of
$P = Q \setminus \{0\}$ are reachable. The word $a^{n-2} e$ reaches $Q$, and the word $(a^{n-2} c)^{n-2}$
reaches $P$. Now suppose we have a set $S$ of $k$ elements, $S = \{i_1, i_2, \ldots, i_k\}$,
where $1 \le i_1 < i_2 < \cdots < i_k \le n - 1$. To delete the $j$th element of $S$ apply
$a^{i_j} d a^{n-1-i_j}$. Hence all subsets of $P$ can be reached.

Note now that $a^{i-1} e$ is accepted only from state $i$, for $i = 1, \ldots, n - 1$, and
the empty word is accepted only from state 0. It follows that all subsets of $P$
are pairwise distinguishable. □
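
The words used in this proof can be replayed on the reversed automaton. The sketch below rests on an assumption: the definition of $\mathcal{A}_n$ from Theorem 6 is not restated in this section, so the inputs are taken to be $a = (1, \ldots, n-1)$, $c$ mapping $n-1$ to 1, $d$ mapping $n-1$ to 0, and $e$ mapping every state to 1, with accepting state $n-1$. This choice agrees with the inputs quoted for $n = 3$ in the proof of Theorem 7, but it remains a guess for larger $n$.

```python
def theorem6_inputs(n):
    """ASSUMED inputs of A_n (Theorem 6), restricted to {a, c, d, e}."""
    a = [0] + list(range(2, n)) + [1]        # cycle (1, 2, ..., n-1)
    c = list(range(n)); c[n - 1] = 1         # n-1 -> 1
    d = list(range(n)); d[n - 1] = 0         # n-1 -> 0
    e = [1] * n                              # every state -> 1
    return {"a": a, "c": c, "d": d, "e": e}


def reverse_step(subset, t):
    """One move of the reversed automaton: all states p with t[p] in subset."""
    return frozenset(p for p in range(len(t)) if t[p] in subset)


def reverse_run(subset, word, inputs):
    for letter in word:
        subset = reverse_step(subset, inputs[letter])
    return subset


def reachable_subsets(inputs, start):
    """Subsets reached by the subset construction on the reversed automaton."""
    seen, stack = {start}, [start]
    while stack:
        s = stack.pop()
        for t in inputs.values():
            nxt = reverse_step(s, t)
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


if __name__ == "__main__":
    n = 5
    inputs = theorem6_inputs(n)
    start = frozenset({n - 1})               # assumed accepting state of A_n
    print(sorted(reverse_run(start, "a" * (n - 2) + "e", inputs)))
    print(sorted(reverse_run(start, ("a" * (n - 2) + "c") * (n - 2), inputs)))
    print(len(reachable_subsets(inputs, start)))
```

The code only counts reachable subsets; that they are pairwise distinguishable is the second half of the proof and is not re-derived here.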




Fig. 6. Nondeterministic automaton of the reverse of a left ideal. 



8 Two-Sided Ideals and Factor-Closed Languages 

We now consider two-sided ideals and factor-closed languages. We provide support
for the following conjecture:

Conjecture 2 (Two-Sided Ideals and Factor-Closed Languages). If $L$
is a two-sided ideal or a factor-closed language with quotient complexity $\kappa(L) =
n \ge 2$, then it has syntactic complexity $\sigma(L) \le n^{n-2} + (n-2)2^{n-2} + 1$.

We show in this section that this complexity can be reached. Since every
factor-closed language other than $\Sigma^*$ is the complement of a two-sided ideal,
and complementation preserves syntactic complexity, it suffices to consider only
two-sided ideals.

For $n = 1$, the bound of the conjecture does not apply. The only two-sided
ideal is $L = \Sigma^*$, and it has $\sigma(L) = 1$.

For $n = 2$ and $\Sigma = \{a, b\}$, the only two-sided ideal is $L = \Sigma^* a \Sigma^*$, and it has
$\sigma(L) = 2$, which is the bound of the conjecture.

For $n = 3$ and $\Sigma = \{a, b, c\}$, the automaton with inputs $a : [1, 2, 2]$, $b : [0, 0, 2]$,
and $c : [0, 1, 2]$ has $\sigma(L) = 6$, which is the bound of the conjecture.

Definition 4. Let $n \ge 4$, and let $\mathcal{A}_n$ be the automaton

$$\mathcal{A}_n = (\{0, \ldots, n-1\}, \{a, b, c, d, e, f\}, \delta, 0, \{n-1\}),$$

where $a = (1, 2, \ldots, n-2)$, $b = (1, 2)$, $c = \binom{n-2}{1}$, $d = \binom{n-2}{0}$, for $i = 0, \ldots, n-2$,
$\delta(i, e) = 1$ and $\delta(n-1, e) = n-1$, and $f = \binom{1}{n-1}$. The state graph of $\mathcal{A}_n$ is
shown in Fig. 7. For $n = 4$, inputs $a$ and $b$ coincide.

Fig. 7. Automaton $\mathcal{A}_n$ of a two-sided ideal with $n^{n-2} + (n-2)2^{n-2} + 1$ transformations.

Theorem 9 (Two-Sided Ideals and Factor-Closed Languages). Automaton
$\mathcal{A}_n$ of Fig. 7 is minimal and the language $L = L(\mathcal{A}_n)$ accepted by $\mathcal{A}_n$ is a
two-sided ideal and has syntactic complexity $\sigma(L) = n^{n-2} + (n-2)2^{n-2} + 1$.

Proof. For $i = 1, \ldots, n-2$, state $i$ is the only non-final state that accepts
$a^{n-1-i} f$; hence all these states are distinguishable. State 0 is distinguishable
from these states, because it does not accept any words in $a^* f$. Hence $\mathcal{A}_n$ is
minimal. The proof that $L$ is a left ideal is like that in Theorem 6. Since
$L_{ef} = \Sigma^*$ is the only accepting quotient, $L$ is a right ideal. Hence it is two-sided.
Consider any transformation $t$ of the form

$$t = \begin{pmatrix} 0 & 1 & 2 & \cdots & n-3 & n-2 & n-1 \\ 0 & i_1 & i_2 & \cdots & i_{n-3} & i_{n-2} & n-1 \end{pmatrix},$$

where $i_k \in \{0, 1, 2, \ldots, n-2, n-1\}$ for $1 \le k \le n-2$; there are $n^{n-2}$ such
transformations. We have two cases:

1. If $i_k \ne n-1$ for all $k$, $1 \le k \le n-2$, then all the images $i_1, \ldots, i_{n-2}$
   are in the set $\{0, \ldots, n-2\}$. By Theorem 2, $t$ can be done by $\mathcal{A}_n$.

2. If $i_h = n-1$ for some $h$, $1 \le h \le n-2$, then there exists some $j$, $1 \le j \le n-2$,
   such that $i_k \ne j$ for all $k$, $1 \le k \le n-2$.

   Define $i'_k$ for all $1 \le k \le n-2$ as follows:

   $$i'_k = \begin{cases} j, & \text{if } i_k = n-1; \\ i_k, & \text{if } i_k \ne n-1. \end{cases}$$

   Let

   $$s = \begin{pmatrix} 0 & 1 & 2 & \cdots & n-3 & n-2 & n-1 \\ 0 & i'_1 & i'_2 & \cdots & i'_{n-3} & i'_{n-2} & n-1 \end{pmatrix}$$

   and $r = (1, j)$. By Theorem 2, $s$ and $r$ can be performed by $\mathcal{A}_n$.

Now consider $srfr$. If $t$ maps $k$ to $n-1$, then $s$ maps $k$ to $j$, $r$ maps $j$ to 1,
$f$ maps 1 to $n-1$, and $r$ maps $n-1$ to $n-1$. If $t$ maps $k$ to 1, then $s$ maps $k$
to 1, $r$ maps 1 to $j$, $f$ maps $j$ to $j$, and $r$ maps $j$ to 1. Finally, if $t$ maps $k$ to an
element other than 1 or $n-1$, then $srfr$ maps $k$ to the same element. Hence we
have $t = srfr$, and $t$ can be performed by $\mathcal{A}_n$ as well.
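
The identity $t = srfr$ can be checked exhaustively for small $n$. The sketch below reads the case analysis as follows, which is an interpretation rather than a quotation: $s$ agrees with $t$ except that every image $n-1$ is replaced by $j$, and $r$ is the transposition of 1 and $j$ (the identity when $j = 1$). Whether $s$ and $r$ are themselves generated by the automaton is not tested here.

```python
from itertools import product


def compose(*ts):
    """Left-to-right composition: ts[0] is applied first."""
    out = list(range(len(ts[0])))
    for t in ts:
        out = [t[x] for x in out]
    return out


def check_case2(n):
    """Verify t = s r f r for every t of case 2; return how many were checked."""
    last = n - 1
    f = list(range(n)); f[1] = last                      # f maps 1 to n-1
    checked = 0
    for images in product(range(n), repeat=n - 2):       # images of states 1..n-2
        if last not in images:
            continue                                     # case 1, not treated here
        t = [0, *images, last]
        # a middle state j that is not an image of any middle state
        j = next(m for m in range(1, n - 1) if m not in images)
        s = [0, *(j if x == last else x for x in images), last]
        r = list(range(n)); r[1], r[j] = j, 1            # transposition (1, j)
        assert compose(s, r, f, r) == t
        checked += 1
    return checked


if __name__ == "__main__":
    print(check_case2(4), check_case2(5))
```

The number of transformations checked is $n^{n-2} - (n-1)^{n-2}$, the number of choices of images in which $n-1$ occurs at least once.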

Refer to states in $\{1, \ldots, n-2\}$ as the middle states. Take any transformation
$t$ that maps 0 to $k \in \{1, \ldots, n-2\}$, and any middle state to either state in
$\{k, n-1\}$. There are $(n-2)2^{n-2}$ such transformations. First consider any entry
$i$ that is mapped to $n-1$ by $t$. We can map $i$ to $n-1$ without changing any
other states. First, apply $a^{n-1-i}$ to rotate all the middle states clockwise, so
that $i$ is mapped to 1, then apply $f$ to map $i$ to $n-1$, and then $a^{i-1}$ to return all
the states other than $n-1$ to their original positions. This is repeated for all the
states that are mapped to $n-1$ by $t$. After this is done, apply $e$ to replace all the
middle states by 1, and apply $a^{k-1}$ to move 1 to $k$. Hence $t$ can be performed.
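
The construction in this paragraph is effectively an algorithm that outputs a word for a given $t$. The following sketch spells it out, assuming the inputs $a = (1, \ldots, n-2)$, $e$ (every state except $n-1$ goes to 1) and $f$ (1 goes to $n-1$) of Definition 4, and writing $k$ for the image of state 0; the function names are illustrative.

```python
from itertools import product


def aef_inputs(n):
    """Inputs a, e, f of Definition 4 as lists: gens[x][p] = delta(p, x)."""
    a = [0] + list(range(2, n - 1)) + [1, n - 1]   # cycle (1, 2, ..., n-2)
    e = [1] * (n - 1) + [n - 1]                    # states 0..n-2 go to 1
    f = list(range(n)); f[1] = n - 1               # 1 goes to n-1
    return {"a": a, "e": e, "f": f}


def word_for(t):
    """Word performing t, where t maps 0 to k and each middle state to k or n-1."""
    n, k = len(t), t[0]
    w = ""
    for i in range(1, n - 1):
        if t[i] == n - 1:
            # rotate i to 1, send it to n-1, rotate the rest back
            w += "a" * (n - 1 - i) + "f" + "a" * (i - 1)
    return w + "e" + "a" * (k - 1)


def apply_word(w, gens):
    out = list(range(len(gens["a"])))
    for letter in w:
        out = [gens[letter][x] for x in out]
    return out


if __name__ == "__main__":
    t = [2, 4, 2, 4, 4]                            # n = 5, k = 2
    print(word_for(t), apply_word(word_for(t), aef_inputs(5)))
```

Each block $a^{n-1-i} f a^{i-1}$ rotates the middle states by a total of $n-2$ positions, a full turn, so it moves only state $i$.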

Finally, the constant transformation $\binom{Q}{n-1}$ is done by $ef$.

In summary, the syntactic complexity of the language accepted by $\mathcal{A}_n$ is at
least $n^{n-2} + (n-2)2^{n-2} + 1$.

Note that 0 is mapped to a middle state if and only if the input word
contains an $e$. But every word of the form $xe$ leaves the automaton in a state
in $\{1, n-1\}$. Applying any other word can only result in a state in $\{i, n-1\}$,
for some middle state $i$. Hence no transformations other than the ones we have
considered can be done by $\mathcal{A}_n$, and the syntactic complexity of the language
accepted by $\mathcal{A}_n$ is precisely $n^{n-2} + (n-2)2^{n-2} + 1$. □
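
For small $n$ the count in Theorem 9 can be reproduced by generating the transition semigroup directly. This is a sketch under one reading of Definition 4, in which $c$ maps $n-2$ to 1, $d$ maps $n-2$ to 0, and $f$ maps 1 to $n-1$, each fixing all other states; the helper names are illustrative.

```python
def definition4_inputs(n):
    """Inputs a..f of A_n (Definition 4, n >= 4) as lists t with t[p] = delta(p, x)."""
    ident = list(range(n))
    a = [0] + list(range(2, n - 1)) + [1, n - 1]   # cycle (1, 2, ..., n-2)
    b = ident[:]; b[1], b[2] = 2, 1                # transposition (1, 2)
    c = ident[:]; c[n - 2] = 1                     # n-2 -> 1
    d = ident[:]; d[n - 2] = 0                     # n-2 -> 0
    e = [1] * (n - 1) + [n - 1]                    # states 0..n-2 -> 1
    f = ident[:]; f[1] = n - 1                     # 1 -> n-1
    return [a, b, c, d, e, f]


def semigroup_size(generators):
    """Number of distinct non-empty products of the generators."""
    gens = [tuple(g) for g in generators]
    elements, frontier = set(gens), list(gens)
    while frontier:
        s = frontier.pop()
        for g in gens:
            t = tuple(g[x] for x in s)             # apply s first, then g
            if t not in elements:
                elements.add(t)
                frontier.append(t)
    return len(elements)


def bound(n):
    return n ** (n - 2) + (n - 2) * 2 ** (n - 2) + 1


if __name__ == "__main__":
    for n in (4, 5):
        print(n, semigroup_size(definition4_inputs(n)), bound(n))
```

For $n = 4$ the inputs $a$ and $b$ are the same list, as noted in Definition 4, so only five distinct generators are in play.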

Table 6 summarizes our results for two-sided ideals. For $\Sigma = \{a, b\}$, the
values are reached by the languages $\Sigma^* a^{n-1} \Sigma^*$ for $n \ge 2$. For $n = 4$, $|\Sigma| = 3$,
the value 16 is reached by $\mathcal{A}_4$ restricted to $\{a, e, f\}$. For $|\Sigma| = 4$, the value 23
is reached by $\mathcal{A}_4$ restricted to $\{a, d, e, f\}$. For $n = 5$, $|\Sigma| = 3$, the value 47 is
reached by $\mathcal{A}_5$ restricted to $\{a, e, f\}$. For $|\Sigma| = 4$, the value 90 is reached by $\mathcal{A}_5$
restricted to $\{a, d, e, f\}$. For $|\Sigma| = 5$, the value 90 is reached by $\mathcal{A}_5$ restricted to
$\{a, c, d, e, f\}$.



Table 6. Syntactic complexities for two-sided ideals.

                  n = 5   n = n
$|\Sigma| = 6$     150    $n^{n-2} + (n-2)2^{n-2} + 1$



Our previous two results about reversal apply here as well. 

Theorem 10 (Reverse of Two-Sided Ideal). The reverse of the two-sided
ideal accepted by automaton $\mathcal{A}_n$ of Theorem 9 restricted to $\{a, d, e, f\}$ has
$2^{n-2} + 1$ quotients, which is the maximum possible for a two-sided ideal.

Proof. Consider the subset construction applied to the nondeterministic automaton
of Fig. 8. Let $P = Q \setminus \{0, n-1\}$. We will show that $Q$ and all sets of the
form $\{n-1\} \cup S$, where $S \subseteq P$, are reachable. First, $Q$ is reached by $fe$. Also,
$\delta(\{n-1\}, (fa)^{n-3} f) = \{n-1\} \cup P$. To remove $i$, $1 \le i \le n-2$, from any
set $\{n-1\} \cup S$, apply $a^{i-1} d$; this also rotates the remaining states of $P$ to the
left by $i - 1$ positions. Then apply $a^{n-2-(i-1)}$ to return the remaining states to
their original positions. Hence all sets of the form $\{n-1\} \cup S$ are reachable. One
verifies that all the $2^{n-2} + 1$ subsets are pairwise distinguishable. □
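
The quotient count can be recomputed for small $n$ by determinizing the reversed automaton and then merging equivalent subsets. The sketch below assumes the inputs of Definition 4 restricted to $\{a, d, e, f\}$, with $d$ mapping $n-2$ to 0 and $f$ mapping 1 to $n-1$, and uses a plain partition refinement in the style of Moore's algorithm, which is a choice made here and not the paper's argument.

```python
def adef_inputs(n):
    """Inputs a, d, e, f of Definition 4 (n >= 4) as lists t with t[p] = delta(p, x)."""
    a = [0] + list(range(2, n - 1)) + [1, n - 1]   # cycle (1, 2, ..., n-2)
    d = list(range(n)); d[n - 2] = 0               # n-2 -> 0
    e = [1] * (n - 1) + [n - 1]                    # states 0..n-2 -> 1
    f = list(range(n)); f[1] = n - 1               # 1 -> n-1
    return {"a": a, "d": d, "e": e, "f": f}


def reverse_quotient_count(inputs, initial, finals):
    """Number of quotients of the reverse of the language of the given DFA."""
    n = len(next(iter(inputs.values())))
    start = frozenset(finals)
    subsets, stack, step = {start}, [start], {}
    while stack:                                   # subset construction on the reversal
        s = stack.pop()
        for x, t in inputs.items():
            nxt = frozenset(p for p in range(n) if t[p] in s)
            step[s, x] = nxt
            if nxt not in subsets:
                subsets.add(nxt)
                stack.append(nxt)
    # A subset accepts iff it contains the initial state of the original DFA.
    block = {s: int(initial in s) for s in subsets}
    while True:                                    # refine until the partition is stable
        sig = {s: (block[s], tuple(block[step[s, x]] for x in inputs)) for s in subsets}
        ids = {v: k for k, v in enumerate(set(sig.values()))}
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = {s: ids[sig[s]] for s in subsets}


if __name__ == "__main__":
    for n in (4, 5, 6):
        print(n, reverse_quotient_count(adef_inputs(n), 0, {n - 1}), 2 ** (n - 2) + 1)
```

The refinement step is included so that the result counts distinguishable subsets rather than merely reachable ones.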

Despite the fact that the Myhill congruence has left-right symmetry, there 
are significant differences between left and right ideals. The major open problem 
concerning ideals is to find a better upper bound for left ideals. Also, the relation 
between syntactic complexity and reversal deserves further study.

Fig. 8. Nondeterministic automaton of the reverse of a two-sided ideal.